Improving Few-shot and Zero-shot Entity Linking with Coarse-to-Fine Lexicon-based Retriever
Few-shot and zero-shot entity linking focus on the tail and emerging
entities, which are more challenging but closer to real-world scenarios. The
mainstream approach is the two-stage "retrieve and rerank" framework. In this
paper, we propose a coarse-to-fine lexicon-based retriever that generates
entity candidates effectively in two layers. The first
layer retrieves coarse-grained candidates by leveraging entity names, while the
second layer narrows down the search to fine-grained candidates within the
coarse-grained ones. In addition, this second layer utilizes entity
descriptions to effectively disambiguate tail or new entities that share names
with existing popular entities. Experimental results indicate that our approach
obtains superior performance without requiring extensive fine-tuning in the
retrieval stage. Notably, our approach ranked first in NLPCC 2023 Shared Task
6 on Chinese Few-shot and Zero-shot Entity Linking.
Comment: Accepted to NLPCC 2023
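The abstract describes the retriever only at a high level; the following Python sketch illustrates the two-layer idea under stated assumptions: a toy knowledge base and a simple character-overlap score stand in for the paper's actual lexicon-based matching, and all names below are illustrative.

# Illustrative sketch of a coarse-to-fine lexicon-based retriever.
# The knowledge base, scoring function, and candidate-set sizes are
# hypothetical stand-ins, not the authors' implementation.

def char_overlap(a: str, b: str) -> float:
    """Toy lexical score: Jaccard overlap of character sets."""
    sa, sb = set(a), set(b)
    return len(sa & sb) / max(len(sa | sb), 1)

def coarse_to_fine_retrieve(mention, context, kb, coarse_k=50, fine_k=10):
    # Layer 1 (coarse): rank all entities by name similarity to the mention.
    coarse = sorted(kb, key=lambda e: char_overlap(mention, e["name"]),
                    reverse=True)[:coarse_k]
    # Layer 2 (fine): re-rank the coarse set by how well each entity's
    # description matches the mention's context, which helps disambiguate
    # tail or new entities that share names with popular ones.
    return sorted(coarse, key=lambda e: char_overlap(context, e["description"]),
                  reverse=True)[:fine_k]

kb = [{"name": "Mercury", "description": "smallest planet of the solar system"},
      {"name": "Mercury", "description": "chemical element with symbol Hg"}]
print(coarse_to_fine_retrieve("Mercury", "the element is liquid at room temperature", kb))

Because both layers are purely lexical, nothing in this sketch needs to be trained, which is consistent with the abstract's claim that retrieval works without extensive fine-tuning.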
Dialogue State Induction Using Neural Latent Variable Models
A dialogue state module is a useful component of a task-oriented dialogue
system. Traditional methods find dialogue states by manually labeling training
corpora, upon which neural models are trained. However, the labeling process
can be costly, slow, error-prone, and more importantly, cannot cover the vast
range of domains in real-world dialogues for customer service. We propose the
task of dialogue state induction, building two neural latent variable models
that mine dialogue states automatically from unlabeled customer service
dialogue records. Results show that the models can effectively find meaningful
slots. In addition, equipped with induced dialogue states, a state-of-the-art
dialogue system performs better than it does without a dialogue state module.
Comment: IJCAI 2020
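As a rough illustration of the latent-variable idea (the paper proposes two specific models; this generic PyTorch VAE sketch is neither of them), candidate value spans from unlabeled dialogues are encoded into a latent representation and reconstructed, and clustering that latent space would correspond to inducing slots without labels. All layer sizes and names are assumptions.

import torch
import torch.nn as nn
import torch.nn.functional as F

class SlotInductionVAE(nn.Module):
    # Encodes a bag-of-words candidate span into a latent code and
    # reconstructs it; the latent space is where slots get induced.
    def __init__(self, vocab_size, hidden=64, latent=16):
        super().__init__()
        self.enc = nn.Linear(vocab_size, hidden)
        self.mu = nn.Linear(hidden, latent)
        self.logvar = nn.Linear(hidden, latent)
        self.dec = nn.Linear(latent, vocab_size)

    def forward(self, bow):  # bow: (batch, vocab_size)
        h = torch.relu(self.enc(bow))
        mu, logvar = self.mu(h), self.logvar(h)
        z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)  # reparameterize
        recon = F.binary_cross_entropy_with_logits(self.dec(z), (bow > 0).float())
        kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
        return recon + kl, mu

model = SlotInductionVAE(vocab_size=100)
loss, codes = model(torch.rand(8, 100))  # 8 unlabeled candidate spans
loss.backward()  # after training, `codes` can be clustered into slots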
Cross-lingual Prompting: Improving Zero-shot Chain-of-Thought Reasoning across Languages
Chain-of-thought (CoT) prompting elicits models to explicitly generate
reasoning paths, thereby improving reasoning accuracy, and has attracted
increasing attention. In particular, zero-shot CoT achieves remarkable improvements in a
wide range of reasoning tasks by simply instructing the LLM with the prompt
"Let's think step by step!". Despite the success of zero-shot CoT, the existing
zero-shot prompting techniques remain limited to a single language, making it
challenging to generalize to other languages and hindering global development.
In this work, we introduce cross-lingual prompting (CLP), aiming to improve
zero-shot CoT reasoning across languages. Specifically, CLP consists of two
main components: (1) cross-lingual alignment prompting and (2) task-specific
solver prompting. The cross-lingual alignment prompting is responsible for
aligning representations across different languages, whereas the task-specific
solver prompting is used to generate the final chain of thought and answer
for the reasoning task. In addition, we introduce cross-lingual
self-consistent prompting (CLSP) to ensemble different reasoning paths across
languages. Our experimental evaluations on several benchmarks demonstrate that
CLP and CLSP significantly outperform the existing prompting methods and
achieve state-of-the-art performance. We hope this work will inspire further
breakthroughs in cross-lingual CoT.
Comment: Accepted at EMNLP 2023 Main Conference
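A minimal sketch of the two-stage pipeline, assuming a hypothetical query_llm client; the prompt wording below is illustrative, not the paper's exact prompts. CLSP is shown as a simple majority vote over answers obtained through different pivot languages.

from collections import Counter

def query_llm(prompt: str) -> str:
    raise NotImplementedError("plug in any chat-completion client here")

def clp(question, src_lang, pivot_lang="English"):
    # Step 1: cross-lingual alignment prompting aligns the request
    # from the source language into the pivot language.
    aligned = query_llm(
        f"Act as an expert in multilingual understanding. "
        f"Restate the following {src_lang} request in {pivot_lang}:\n{question}")
    # Step 2: task-specific solver prompting generates the chain of
    # thought and the final answer (zero-shot CoT style).
    return query_llm(f"{aligned}\nLet's solve the task step by step, "
                     f"then state the final answer.")

def clsp(question, src_lang, pivot_langs):
    # Ensemble reasoning paths across pivot languages and majority-vote.
    answers = [clp(question, src_lang, p) for p in pivot_langs]
    return Counter(answers).most_common(1)[0][0]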
DCR-Net: A Deep Co-Interactive Relation Network for Joint Dialog Act Recognition and Sentiment Classification
In dialog systems, dialog act recognition and sentiment classification are two
correlated tasks for capturing speakers' intentions, where dialog act and
sentiment indicate the explicit and the implicit intentions, respectively.
Most of the existing systems either treat them as separate tasks or just
jointly model the two tasks by sharing parameters in an implicit way without
explicitly modeling mutual interaction and relation. To address this problem,
we propose a Deep Co-Interactive Relation Network (DCR-Net) to explicitly
consider the cross-impact and model the interaction between the two tasks by
introducing a co-interactive relation layer. In addition, the proposed relation
layer can be stacked to gradually capture mutual knowledge with multiple steps
of interaction. In particular, we thoroughly study different relation layers and
their effects. Experimental results on two public datasets (Mastodon and
DailyDialog) show that our model outperforms the state-of-the-art joint model
by 4.3% and 3.4% F1 score on the dialog act recognition task, and by 5.7% and
12.4% on sentiment classification, respectively. Comprehensive analysis
empirically verifies the effectiveness of explicitly modeling the relation
between the two tasks and of the multi-step interaction mechanism. Finally, we
employ Bidirectional Encoder Representations from Transformers (BERT) in our
framework, which further boosts performance on both tasks.
Comment: Accepted by AAAI 2020 (Oral)
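One illustrative reading of a stackable co-interactive relation layer, sketched in PyTorch: each task's stream cross-attends over the other's and is updated residually, so stacking layers yields multiple steps of mutual interaction. The cross-attention formulation and all dimensions here are assumptions, not the authors' exact architecture.

import torch
import torch.nn as nn

class CoInteractiveLayer(nn.Module):
    def __init__(self, dim, heads=4):
        super().__init__()
        self.act_from_sent = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.sent_from_act = nn.MultiheadAttention(dim, heads, batch_first=True)

    def forward(self, act, sent):  # (batch, utterances, dim) per task
        # The dialog-act stream queries the sentiment stream, and vice
        # versa, explicitly modeling the cross-impact between the tasks.
        act_new, _ = self.act_from_sent(act, sent, sent)
        sent_new, _ = self.sent_from_act(sent, act, act)
        return act + act_new, sent + sent_new  # residual updates

layers = nn.ModuleList(CoInteractiveLayer(128) for _ in range(3))  # stacked steps
act = sent = torch.randn(2, 10, 128)
for layer in layers:
    act, sent = layer(act, sent)  # mutual knowledge refined step by step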